Using the Forth Assembler

What this file is:

A lot of the mail I have been getting about Blazin' Forth concerns the use of CODE and ;CODE definitions, and combining assembly language with Forth. It recently occurred to me that a lot of people don't know how to use the Forth assembler - and are therefore missing out on a considerable amount of the power of the Forth language. So this is an attempt at a tutorial in using Assembler in Forth, and in using it in the BFC in particular. Much of what I say here should be applicable to other 6502 Forth implementations - but not necessarily all. Consider yourself warned.

What this file is not:

This is *not* a course in machine language programming. If you need information on the basics of using assembly language, then consult a good text - my personal favorites are the ones by Lance Leventhal, but there are many good ones around.

General Considerations:

Using CODE and ;CODE words is frowned upon in programs which attempt to be portable. In fact, the quickest way to ensure that your programs are not 83 Standard is to use the word CODE in them. Having said this, however, there is much to be said for incorporating a few code definitions in Forth programs. Typically, a program will spend most of its time in one or two words - by recoding these words in assembler, the overall speed of the program can be increased manyfold - 50% increases or more are not unusual, and for very little work.

Also, the incompatibility is not as great as it would appear at first. It is probably safe to say that the most popular 8-bit personal computers use the 6502 CPU (Apple II, C64, C128, Atari line etc.). As long as your CODE definitions are not accessing special hardware features (such as the SID chip in the C64), most code definitions will work on all computers using the same CPU. I have shared CODE definitions with friends who have Apples and Ataris running Forth, and I have never yet had to modify one.

Finally, Forth is the easiest language there is to combine with machine language. Many of the hassles and problems ordinarily associated with combining high level languages and assembler simply do not occur in Forth. In other languages, the typical process is this:

    1: Write the high level program, and debug.
    2: Figure out where you need to speed things up.
    3: Fire up an editor, and write some assembly code.
    4: Assemble the file and - usually - run a loader program on that output.
    5: Load the Editor, and modify your higher level source.
    6: Compile your higher level source.
    7: Link the compiled program and the loaded assembly language program.
    8: When it doesn't work (which it won't, at first), go back to step 3.

There are also inevitable problems in passing parameters to and from the higher level code, and the inevitable problem of where to put the machine language. If you have ever done much of this sort of thing, you know what a headache these problems can be. These last two problems are usually the most difficult, and you will be surprised, and possibly relieved, to hear that they don't occur in Forth. At all. Ever. Combine that with a resident assembler, a resident editor and a resident compiler, and writing Assembly language in Forth becomes almost ridiculously easy.

A Quick Example:

Since I have taken up so much of your time with the above advertisement, you would probably like me to put my money where my mouth is. Here is a quick example of what I was talking about above. I hope it's not too trivial, but the idea I want to get across here is the ease of combining Assembler with Forth. First, a high level Forth Program:

    : SHIFTLEFT  ( N2 N1 -- N3 ) // shift N2 left N1 times
        0 ?DO 2* LOOP ;

    : SHIFT-4  ( Just show what a left shift is )
        100 0 DO I 4 SHIFTLEFT . LOOP ;

Multiple left shifts are fairly common, and so SHIFTLEFT is likely to be a handy word in certain applications. As it is, it will run pretty quickly.
But perhaps, for certain speed demons, not quickly enough, so you decide to recode the SHIFTLEFT primitive in Assembler:

    CODE SHIFTLEFT  ( N2 N1 -- N3 ) // shift N2 left N1 times
        BOT LDA, TAY,
        BEGIN, 0 # CPY, 0= NOT WHILE,
            SEC ASL, SEC 1+ ROL, DEY,
        REPEAT,
        POP JMP,
    END-CODE

If you are used to conventional assemblers, this probably looks pretty weird. The important thing to notice here is that *only* SHIFTLEFT has changed - SHIFT-4 (or any other word which uses SHIFTLEFT) will work just as it did before, with the only change being the overall increase of speed which machine language naturally brings to any situation. Notice also that we didn't have to worry about where to put the code - it goes in the same spot our higher level SHIFTLEFT went. You will also discover, if you type this example in, that you don't have to call the assembler. This is all taken care of for you. As far as you, another user, or other procedures which use SHIFTLEFT are concerned, there is no difference between using the hi-level SHIFTLEFT and the CODE SHIFTLEFT.

The Structure of a CODE definition:

It's pretty straightforward. They all look the same:

    CODE [name] [assembler mnemonics] END-CODE

Note the similarity to a colon definition:

    : [name] [forth words] ;

How to Exit a Code Definition:

One thing you must remember is that you have to explicitly leave a CODE definition by doing a JMP, to another code level routine. Forgetting this is probably the single most common error made by newcomers. Possibly it is caused by making a false analogy between the higher level ; and the code level END-CODE. While the Forth word ; does in fact get you back to where you came from, END-CODE does not. In fact, END-CODE does nothing at all at run-time. You can exit a CODE definition by doing a JMP, to any of the following:

    NEXT  POP  POPTWO  PUSH  PUT

These are described below:

NEXT

NEXT is commonly called the address interpreter. It is the word that is responsible for the execution of all Forth words.
ALL words in Forth ultimately end up here. Doing a NEXT JMP, will cause the current code definition to stop and return to the word that called it. All of the following exit points end with a jump to NEXT. In what follows, remember that a "stack element" refers to a 16 bit quantity -- i.e. two bytes.

POP

POP first drops the top element of the stack, and then jumps to NEXT. Same as DROP in hi-level Forth.

POPTWO

POPTWO drops the top two elements from the stack, and then jumps to NEXT. Same as 2DROP in hi-level Forth.

PUSH

PUSH lets you leave a result on the top of the stack. PUSH expects the low byte of the new top of stack to be on the return stack, and the high byte in the accumulator. PUSH will leave these as the new top of the parameter stack (the former top will then be the second element). PUSH then jumps to NEXT. Since this routine is somewhat more complicated than the others, here is a typical sequence:

    PHA,         ( push low byte to return stack )
    TYA,         ( assume high byte is saved in Y register, move it to the A reg )
    PUSH JMP,    ( parameters set, so jump to PUSH )
    END-CODE

PUT

PUT replaces the top of the stack with a new value. You set up for PUT in the same way as you set up for PUSH - the new lobyte is pushed to the return stack, and the new hibyte is in the accumulator before the jump to PUT. The only difference between the two is that PUT replaces the present top of the stack, while PUSH creates a new top of stack.

Register Usage:

Since Forth uses two stacks, and the 6502 CPU only implements one hardware stack, the parameter stack must be "artificially" maintained. In this implementation (as in most) it is located in the zero page, and the X register is used as the parameter stack pointer. The machine stack is Forth's return stack. Therefore instructions which affect the X register or the machine stack should be used with care. At entry to a CODE definition, the X register points to the low byte of the top element of the parameter stack.
This stack grows downward in memory, so decrementing the X register will make room for another element on the stack, and incrementing the stack pointer will remove an element from the stack. Here is a diagram which shows the situation when two elements are on the stack:

    Hi Memory
               **************
               *  hibyte2   *
               *  lobyte2   *
               **************
               *  hibyte1   *
        X -->  *  lobyte1   *   Top of Stack
               **************

Notice that the two bytes which make up one stack entry are stored in the usual 6502 order, with the lobyte lower in memory. To remove the top element of the stack, we can define the code word DROP:

    CODE DROP  ( N -- )
        INX, INX,
        NEXT JMP,
    END-CODE

Which, in fact, is exactly the way DROP is defined in the BFC. After DROP has been executed, the stack looks like this:

    Hi Memory
               **************
               *  hibyte2   *
        X -->  *  lobyte2   *   New top of Stack
               **************

We will return to this topic later, when we talk about accessing the parameter stack in more detail. For now, the main point is to remember that when Forth starts executing your code definition, the X register will contain a pointer to the top of the stack. You can use the X register to access the stack, or to remove elements from the stack, but when your code definition is finished, other Forth words, and the Forth system itself, are going to expect the X register to contain a valid stack pointer, so don't change it wantonly. You should also remember that since each stack entry is two bytes, only an even number of INX, or DEX, instructions makes sense. (I.E. INX, INX, INX, INX, not INX, INX, INX, - the first will drop two stack elements, while the second will drop 1 and 1/2 stack elements - and cause the Forth system to behave oddly.)

Both the Accumulator (A reg) and the Y register may be freely used. The A register will contain garbage, and must be initialized, but the Y register is guaranteed to be 0 on entry to your code, and you may take advantage of this fact or not, as you wish.
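
As a small illustration of the stack pointer arithmetic described above, here is a sketch of a word that discards two stack elements by hand. The name DROP2 is made up for this example (it is not a BFC word), and the sketch assumes only what has been said so far: each element is two bytes, and the X register must be left pointing at a valid element.

```forth
CODE DROP2  ( N1 N2 -- )
    INX, INX,       ( discard the top element, N2 )
    INX, INX,       ( discard the element beneath it, N1 )
    NEXT JMP,       ( X again points at a valid element )
END-CODE
```

Typing 1 2 3 DROP2 . should print 1. In practice you would probably get the same effect by ending the definition with POPTWO JMP, instead, which is shorter.
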
Here is a short code definition that will leave the value of -1 (Forth's canonical TRUE flag) on the stack. It uses the PUSH routine described earlier.

    CODE TRUE  ( -- -1 )
        DEY,         ( Y REG now holds $ff )
        TYA, PHA,
        PUSH JMP,
    END-CODE

Here is an even shorter definition which will leave a 0 on the parameter stack:

    CODE FALSE  ( -- 0 )
        TYA,         ( set A register to 0 )
        PHA,         ( set up for PUSH )
        PUSH JMP,
    END-CODE

Life being what it is, you will often wish you could use the X register. There is a way. You can use the system storage location XSAVE to temporarily save the value of the X register while you are doing other things. You must remember to restore the X register before exiting, however. A typical sequence is:

    XSAVE STX,
    ( stuff that changes x )
    XSAVE LDX,
    NEXT JMP,
    END-CODE

Why it looks so strange....

The main reason the Forth Assembler looks so strange is that it is reverse polish, like all of Forth. Operands *precede* the operators. Here are some examples that should make it clear:

    Conventional Assembler      Forth's Assembler
    ======================      =================
    LDA #0                      0 # LDA,
    ROL A                       .A ROL,
    STA ADDRESS,X               ADDRESS ,X STA,
    STA (ADDRESS,X)             ADDRESS X) STA,
    LDA (ADDRESS),Y             ADDRESS )Y LDA,
    JMP (INDIRECT)              INDIRECT ) JMP,
    JMP ADDRESS                 ADDRESS JMP,
    LDA ADDRESS                 ADDRESS LDA,

While admittedly unusual, it does make the best use of the stack at assembly time.

The other major difference is that the Forth Assembler does not use labels. There are no branch instructions - the Forth Assembler uses a structured code approach to control flow. It does this by using analogues of the hi-level IF THEN ELSE etc. to control the flow of your CODE definition. To take advantage of this, you must specify the condition code you want tested.
You specify this condition code by using any of the following words:

    CS      test if carry set
    0<      test if negative flag set
    0=      test if zero flag set
    VS      test if overflow flag set

You can follow these condition code specifiers with NOT, to test for the opposite condition:

    CS NOT  test if carry clear
    0< NOT  test if negative flag clear
    0= NOT  test if zero flag clear
    VS NOT  test if overflow flag clear

Below is an example of a possible definition of 0= , which leaves true if the top of the stack is 0, and false (0) if it is anything else:

    CODE 0=  ( N -- FLAG )
        BOT LDA, BOT 1+ ORA,
        0= IF, 255 # LDA, ELSE, 0 # LDA, THEN,
        PHA, PUT JMP,
    END-CODE

In the above code, we first test for 0 by ORing together the two bytes which make up the top of the stack. The result will be zero only if both are zero. We then test the zero flag (with 0=). If the result is 0, we LDA with 255, otherwise we LDA with 0, and replace the top of the stack with the flag by jumping to the PUT exit routine. Note that we could make this definition much shorter by taking advantage of the fact that the Y register is zero at entry:

    CODE 0=  ( N -- FLAG )
        BOT LDA, BOT 1+ ORA,
        0= IF, DEY, THEN,
        TYA, PHA, PUT JMP,
    END-CODE

You can use the same type of tests to do conditional loops. Here is a do nothing example that simply wastes some time in a loop:

    CODE WAIT  ( -- )
        BEGIN, DEY, 0= UNTIL,
        NEXT JMP,
    END-CODE

This simply decrements the Y register until it becomes zero. In the original Forth assembler for 6502 machines, the BEGIN, UNTIL, structure was the only one available. The BFC has extended this to include BEGIN, WHILE, REPEAT, and BEGIN, AGAIN, . The BEGIN, AGAIN, loop is infinite - you must jump out of it in the middle somewhere.

Accessing the Stacks.

Typically, most routines only need to access the top two elements of the parameter stack. Since this is so common, special words have been provided to make life easier here. BOT references the top of the stack (which is lower in memory, and so the BOTtom of the stack).
SEC references the SECond element of the stack. It's important to remember that a Forth stack entry is 16 bits, or two bytes, so to obtain the whole stack element, you need to do two fetches or stores. Here is a sample implementation of DUP:

    CODE DUP  ( N -- N N )
        BOT LDA, PHA,
        BOT 1+ LDA,
        PUSH JMP,
    END-CODE

First we fetch the low byte of the top of the stack with BOT LDA, . This is pushed onto the return stack, as required by the exit routine PUSH . Next we get the high byte with BOT 1+ LDA, . That's all there is to it. As another illustration, here is an implementation of OVER:

    CODE OVER  ( A B -- A B A )
        SEC LDA, PHA,
        SEC 1+ LDA,
        PUSH JMP,
    END-CODE

The actual addressing mode being used here is "zero-page X". In conventional assembler, BOT LDA, would be written LDA 0,X while BOT 1+ LDA, would be written LDA 1,X . To access deeper stack elements, you can keep adding values to BOT or SEC , or you can use the addressing mode explicitly:

    BOT 4 + LDA,    or    4 ,X LDA,

While not used often, it is also possible to address directly into the return stack. Typically, you would access the return stack using the PLA, instruction. This has the side effect of altering the stack pointer, and you can also only access the top of the return stack. To access arbitrary bytes, you can use RP) . To do this, you must first save the X register in XSAVE and then execute TSX, which will move the stack pointer into the X register. You can then do RP) LDA, which will fetch the current top of the return stack. To get deeper into the return stack, offset RP). Below is an example which non-destructively copies the address on the return stack to the top of the parameter stack:

    CODE GET-RETURN-ADDRESS
        XSAVE STX, TSX,
        RP) LDA, PHA,
        RP) 1+ LDA,
        XSAVE LDX,
        PUSH JMP,
    END-CODE

One of the easiest and quickest ways to crash any system is to garbage the return stack. If you need to access the stack, go for it, but use care.

SETUP and N

It is often useful to be able to access absolute memory locations.
To this end, Forth provides an 8 byte temporary data area which is referred to as the N area. You may initialize this area yourself, or you may call SETUP to move stack elements to the N area. To use SETUP, you must load the accumulator with the number of stack elements you want to move (NOTE: the number of stack elements, NOT the number of bytes) and then do a JSR to SETUP. SETUP will pop the elements off of the stack, and move them to the N area. Since there are only 8 bytes in the N area, you have room for at most 4 stack elements. As a simple example, the following will pop the top two elements off of the stack, and move them to the N area:

    CODE MOVE2
        2 # LDA, SETUP JSR,
        NEXT JMP,
    END-CODE

The previous top of stack (BOT and BOT 1+) will be stored at N and N 1+ , while the second element (SEC and SEC 1+) will be at N 2+ and N 3 + . Once they have been moved there, you can carry out operations on them, or use them in the indirect indexed (Y) addressing mode. ( LDA (N),Y in conventional assembler, or N )Y LDA, in Forth assembler.) By far the most common error in using SETUP is to forget that it also pops the elements off the stack as it moves them.

;CODE

If CODE is the Assembler's equivalent to the hi-level colon, ;CODE is the Assembler's equivalent to DOES>. As an example of the use of ;CODE, we will write our own versions of CONSTANT - one in high level Forth, using DOES>, and one in low level, using ;CODE. First the hi-level definition:

    : CONSTANT CREATE , DOES> @ ;

As a quick review, remember that CREATE (or a word that uses CREATE) creates a dictionary entry for the next word in the input stream. Words created in this way all have the same run-time behaviour - they leave the address of their parameter field on the stack. Note that CREATE by itself does not allocate any parameter field space in the dictionary; you must do this yourself by using the words C, or , or ALLOT .
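
To make the point about allocation concrete, here is a sketch in hi-level Forth. The names PRIMES and SCRATCH are invented for the example, and the 2+ step assumes 16 bit cells, as on the 6502 systems discussed here.

```forth
CREATE PRIMES   2 , 3 , 5 ,     ( three cells, compiled with , )
CREATE SCRATCH  16 ALLOT        ( sixteen uninitialized bytes )

PRIMES @ .                      ( fetch and print the first cell )
PRIMES 2+ @ .                   ( step over one cell, print the second )
```

Both PRIMES and SCRATCH do nothing at run-time except leave the address of their parameter field on the stack; everything else is up to the words that follow them.
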
The DOES> word allows you to manipulate the values in the parameter field, essentially allowing you to define a special set of related words which share the same run-time behaviour. Let's walk through the process:

    10 CONSTANT TEN

When Forth executes the above line, the number 10 will be left on the stack, and CONSTANT will be executed. The first word in CONSTANT is CREATE, which will take the first word it finds in the input stream (TEN in this case) and create a header for it in the dictionary, with its associated link field, name field and code field. The code field written by CREATE will cause the address of the first byte of the parameter field to be left on the stack when TEN (the word defined by CONSTANT) is executed. Next , is executed, which takes the top of the stack, and compiles this number in the dictionary. DOES> does something mysterious (more on this later) which causes the words following DOES> to be executed at run-time. This is the compile time behaviour of CONSTANT.

Now, when TEN is executed, the run-time behaviour of CONSTANT will occur. The parameter field's address is pushed onto the stack, and then the words after DOES> are executed. In this case, the single word @ will be executed, which replaces the address on the top of the stack with the value at that address.

    TEN . 10 OK

Now for the ;CODE version:

    : CONSTANT
        CREATE ,
        ;CODE
        2 # LDY, W )Y LDA, PHA,
        INY, W )Y LDA,
        PUSH JMP,
    END-CODE

Notice, by the way, that no concluding semi-colon is required when you use ;CODE.

When ;CODE is executed (for example, by typing 10 CONSTANT TEN), it will REWRITE THE CODE FIELD OF THE WORD BEING DEFINED! This is extremely important to remember. All Forth words have code fields that define their run time behaviour, with colon definitions all sharing the same code field, variables sharing the same code field (different, of course, from that of colon definitions and constants) and so on.
When CREATE executes in our CONSTANT definition, it creates a code field that contains the address of a routine which will push the parameter field address of the word being defined to the stack. When ;CODE executes in our CONSTANT definition, it will search out this code field, and re-write it with the address of the machine language routine which immediately follows ;CODE. This means that the run time behaviour of a word which contains a ;CODE termination is determined entirely by you. Even though there is a CREATE in the lo-level definition of CONSTANT, it will not deposit the address of the parameter field on the stack, because ;CODE has changed the code field to point to our machine language routine. An illustration may make this clearer:

    Header created by CREATE:

    **************
    * Link Field *
    **************
    *    TEN     *    Name field - stores ASCII characters of defined name.
    **************
    *   $0900    *    $0900 = address of routine to push parameter field on stack
    **************
    *     10     *    Parameter field - stores value of constant
    **************

Note that $0900 is just an example address - don't try jumping to it in your own code. When a word defined by CREATE is executed, the machine language routine at $0900 (the address in the code field of the word) will be executed. This routine is what causes the run-time behaviour of CREATE defined words - it pushes the address of the parameter field to the stack. The above illustration shows the state of the dictionary after the CREATE and , part of CONSTANT have been executed.

Now, suppose that the address of the first instruction following ;CODE in CONSTANT ( the 2 # LDY, ) is at $8000 . When ;CODE executes in the course of defining a new constant, the header will look like this:

    Header after ;CODE has been executed:

    **************
    * Link Field *
    **************
    *    TEN     *    Name field - stores ASCII characters of defined name.
    **************
    *   $8000    *    $8000 = address of your machine language routine
    **************
    *     10     *    Parameter field - stores value of constant
    **************

So, as always, when Forth executes TEN, the first thing to be executed is the routine whose address is in the code field, which no longer points to the CREATE run time code, but to your machine language code. Hope that is all clear enough.

Now, to the particulars of the ;CODE part of our low level CONSTANT definition (note that the following is implementation dependent. Most present day Forths use an implementation along these lines, but not all.)

Forth depends on two registers for its performance, the IP and the W register. When Forth is implemented on processors with more registers, typically a processor register will be used for each of these Forth system registers. However, there are not enough on the 6502, so zero page locations have been used instead. W will contain the address of the code field of the word currently being executed. For example, if the code field of TEN is located at $9000, the W register will contain $9000, or the address of the address of the routine to be executed. The ;CODE part of CONSTANT uses this fact to access the parameter field of the current word. Since the code field of a word is always two bytes long, the first byte of the parameter field is located at $9002. Given this information, it becomes a simple matter of indirectly indexing from W. Loading the Y register with 2, and then performing W )Y LDA, (LDA (W),Y in conventional assembler) will get the first byte of the constant's value. The next byte is at $9003, so we simply increment Y, and indirectly index again to get the next byte. The rest of the definition consists of setting up for PUSH, which was covered earlier.

Phew! Hope that was clear enough. I think you can see that indexing from W can get you any byte in a definition's parameter field.
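
As a sketch of reaching further into the parameter field, the following defining word stores two values and builds children that return the one stored second. SECOND-OF is a name invented for this example, and the sketch assumes the same layout as the CONSTANT above: a two byte code field followed directly by the parameter field.

```forth
: SECOND-OF  ( N1 N2 -- )
    CREATE , ,          ( compiles N2 first, then N1 )
    ;CODE
    4 # LDY,            ( 2 bytes of code field + 2 bytes of first cell )
    W )Y LDA, PHA,      ( low byte of second cell, set up for PUSH )
    INY,
    W )Y LDA,           ( high byte of second cell in A )
    PUSH JMP,
END-CODE
```

With this loaded, 10 20 SECOND-OF TENNER followed by TENNER . should print 10, since 20 occupies the first cell of the parameter field and 10 the second.
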
W 1- contains an indirect JMP instruction, so you can also vector control to other routines by storing the address of the address of the routine in W, and then doing a W 1- JMP, . This technique is rarely used, however.

There is an intimate relationship between the IP register and the W register, and an effective use of these registers depends on a clear understanding of how they work in the Forth system as a whole. Both of them are changed by NEXT. When NEXT executes, the IP will point to an address which contains the code field address of the next word to be executed by Forth. NEXT fetches this address, and stores it in W. It then bumps the IP (which stands for Interpretive Pointer) by two, so the next time around it will be pointing to the next word to be executed. Finally, NEXT does a JMP to W 1-, which causes an indirect jump to the address pointed to by W.

The IP is most useful for accessing things like inline data structures (character strings, and the like). Since the IP is incremented by NEXT, the code fragment:

    IP )Y LDA, PHA,
    INY, IP )Y LDA,
    PUSH JMP,

will push the address which contains the address of the next Forth word to be executed to the parameter stack. (This will be the third and fourth bytes past the address of the current word being interpreted.) Obviously, by storing new values in the IP, and then jumping to NEXT , you can also force the execution of a particular definition. This technique is also rarely used. Typical uses for the IP are mainly accessing inline data.

Accessing Variables from Code definitions.

Since variables leave the address of their parameter fields on the stack, it is a simple matter to access these variables from CODE level definitions. Here is an example:

    VARIABLE FOO

    CODE FOO+1
        FOO LDA, CLC, 1 # ADC, FOO STA,
        CS IF, FOO 1+ INC, THEN,
        NEXT JMP,
    END-CODE

This is a common way to increment the value at a memory location. The assembler also provides a way to access USER variables, using the UP (user pointer constant).
User variables must be accessed using an offset, so the )Y addressing mode is recommended. You will need to know the offset of the user variable you need to access in advance, of course.

MISC.

Since the Assembler is co-resident with the FORTH system, all the power of FORTH is available to you when using the Assembler. As one example:

    FOO @ 2 3 */ # LDA,

will initialize the accumulator to two thirds of the value stored in the variable FOO. (Of course, this must be 255 or less.) Occasional conflicts may arise, however. In particular, a common error is to confuse the Assembler's 0= and 0< words with Forth's - they are not the same. If you wish to use the Forth versions while assembling, you must explicitly enter the FORTH vocabulary, do your Forth thing, and then re-enter the ASSEMBLER vocabulary. Such conflicts are rare, and usually easily recognized.

The Forth assembler uses the standard MOS mnemonics for the 6502 op-codes, but each mnemonic has a ',' attached. Thus, in Forth, we write LDA, BRK, or JMP, not LDA BRK and JMP. Also, the Assembler conditionals use the same convention - IF, ELSE, THEN, and not IF ELSE THEN . A common error is to omit the comma from one or more of these conditionals. I am personally not wild about this convention, but the first Forth assemblers used it, and now we are stuck with it.

You may have noticed that I have not used the JSR, RTS, instructions. This is because typically, CODE definitions are called from higher level words, and must end with a JMP, to NEXT or the equivalent routine. It is possible, however, to write programs which are structured around subroutines. Typically, the subroutines are given names with CREATE, and then called from a higher level code definition. Here is an example:

    CREATE BLETCH ASSEMBLER NOP, NOP, RTS,

Note that when we do this, we must invoke the Assembler vocabulary explicitly.
BLETCH would typically be called from a higher level CODE definition:

    CODE THE-GREAT-RTS-HACK
        BLETCH JSR,
        NEXT JMP,
    END-CODE

This is only occasionally necessary, however - usually when writing code that is extremely time critical, such as graphics code.

Finally, it is possible to exit a code definition by jumping to yet another code routine. You must remember to restore the X register (if you have altered it) and it is also wise to reset the Y register to 0, since many code level definitions will assume that this is the case. As a simple example, here is a word that simply does a JMP, to Forth's KEY routine:

    CODE CALL-KEY ' KEY @ JMP, END-CODE

The tick (') gets the address of KEY's code field, and the @ fetches the address of the routine which is stored in KEY's code field. This is the recommended technique. It is more portable, and also safer than others which one sees. Note that you cannot call hi-level Forth words using this technique - only code level definitions may be called in this way. If you think you want to exit a CODE definition with a call to a higher level Forth word, think again. If you still need to do it, then start tinkering with the IP and W registers.

I think that about covers it. I have tried to cover all of the basics, and many of the more advanced techniques in combining assembly language and Forth. If some of it seems obscure, it is probably my explanation, since hacking in CODE is really no more difficult than hacking in any language - it just runs faster.

Good Luck And Happy Hacking!

SDB